1. Training a 100-billion-parameter model requires 266 8-card A100 servers, with a per-card compute efficiency of about 44% (a back-of-envelope sketch of this figure follows the list).
2. Improving large-model training performance requires optimization across the framework, I/O, and communication layers.
3. Compared with GPT-4, domestic large models still show gaps in compute, algorithms, and data.
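
A rough sense of where a figure like "266 8-card servers" comes from can be gained from the standard 6·N·D training-FLOPs rule of thumb. The sketch below is illustrative only: the training token count (~2.5T) and the training window (~2 months) are assumptions not stated in the source; the 8-card A100 configuration and the 44% per-card efficiency are taken from the summary above.

```python
# Back-of-envelope estimate of the GPU count behind the "266 servers" figure.
# The token count and training window below are assumptions for illustration;
# the source does not state them.

PARAMS = 100e9              # 100B-parameter model (from the source)
TOKENS = 2.5e12             # assumed training tokens (~2.5T, hypothetical)
TARGET_DAYS = 60            # assumed training window (~2 months, hypothetical)
A100_PEAK_FLOPS = 312e12    # A100 BF16 dense peak throughput (NVIDIA spec)
MFU = 0.44                  # per-card compute efficiency cited in the source

total_flops = 6 * PARAMS * TOKENS                  # 6*N*D training-FLOPs rule of thumb
effective_flops_per_gpu = A100_PEAK_FLOPS * MFU    # sustained throughput per card
seconds = TARGET_DAYS * 24 * 3600

gpus_needed = total_flops / (effective_flops_per_gpu * seconds)
servers_needed = gpus_needed / 8                   # 8 A100 cards per server

print(f"GPUs needed:    {gpus_needed:,.0f}")       # ~2,100 cards
print(f"8-card servers: {servers_needed:,.0f}")    # ~260 servers
```

Under these assumptions the estimate lands in the mid-200s of servers, which is consistent with the 266-server figure; different token counts or training windows would shift the number proportionally.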